\hypertarget{chapter-6-some-more-on-strings-and-arrays-of-strings}{%
\subsection{CHAPTER 6: Some more on Strings, and Arrays of
Strings}\label{chapter-6-some-more-on-strings-and-arrays-of-strings}}

Well, let's go back to strings for a bit. In the following all
assignments are to be understood as being global, i.e. made outside of
any function, including main().

We pointed out in an earlier chapter that we could write:

\begin{minted}{c}
   char my_string[40] = "Ted";
\end{minted}

which would allocate space for a 40 byte array and put the string in the
first 4 bytes (three for the characters in the quotes and a 4th to
handle the terminating \textbf{'\textbackslash0'}).

Actually, if all we wanted to do was store the name "Ted" we could
write:

\begin{minted}{c}
   char my_name[] = "Ted";
\end{minted}

and the compiler would count the characters, leave room for the nul
character and store the total of the four characters in memory the
location of which would be returned by the array name, in this case
\textbf{my\_name}.

In some code, instead of the above, you might see:

\begin{minted}{c}
   char *my_name = "Ted";
\end{minted}

which is an alternate approach. Is there a difference between these? The
answer is.. yes. Using the array notation 4 bytes of storage in the
static memory block are taken up, one for each character and one for the
terminating nul character. But, in the pointer notation the same 4 bytes
required, \textbf{plus} N bytes to store the pointer variable
\textbf{my\_name} (where N depends on the system but is usually a
minimum of 2 bytes and can be 4 or more).

In the array notation, \textbf{my\_name} is short for
\textbf{\&myname{[}0{]}} which is the address of the first element of
the array. Since the location of the array is fixed during run time,
this is a constant (not a variable). In the pointer notation
\textbf{my\_name} is a variable. As to which is the \textbf{better}
method, that depends on what you are going to do within the rest of the
program.

Let's now go one step further and consider what happens if each of these
declarations are done within a function as opposed to globally outside
the bounds of any function.

\begin{minted}{c}
void my_function_A(char *ptr)
{
    char a[] = "ABCDE"
    .
    .
} 


void my_function_B(char *ptr)
{
    char *cp = "FGHIJ"
    .
    .
}
\end{minted}

In the case of \textbf{my\_function\_A}, the content, or value(s), of
the array \textbf{a{[}{]}} is considered to be the data. The array is
said to be initialized to the values ABCDE. In the case of
\textbf{my\_function\_B}, the value of the pointer \textbf{cp} is
considered to be the data. The pointer has been initialized to point to
the string \textbf{FGHIJ}. In both \textbf{my\_function\_A} and
\textbf{my\_function\_B} the definitions are local variables and thus
the string \textbf{ABCDE} is stored on the stack, as is the value of the
pointer \textbf{cp}. The string \textbf{FGHIJ} can be stored anywhere.
On my system it gets stored in the data segment.

By the way, array initialization of automatic variables as I have done
in \textbf{my\_function\_A} was illegal in the older K\&R C and only
"came of age" in the newer ANSI C. A fact that may be important when one
is considering portability and backwards compatibility.

As long as we are discussing the relationship/differences between
pointers and arrays, let's move on to multi-dimensional arrays.
Consider, for example the array:

\begin{minted}{c}
    char multi[5][10];
\end{minted}

Just what does this mean? Well, let's consider it in the following
light.

\begin{minted}{c}
    char multi[5][10];
\end{minted}

Let's take the underlined part to be the "name" of an array. Then
prepending the \textbf{char} and appending the \textbf{{[}10{]}} we have
an array of 10 characters. But, the name \textbf{multi{[}5{]}} is itself
an array indicating that there are 5 elements each being an array of 10
characters. Hence we have an array of 5 arrays of 10 characters each..

Assume we have filled this two dimensional array with data of some kind.
In memory, it might look as if it had been formed by initializing 5
separate arrays using something like:

\begin{minted}{c}
    multi[0] = {'0','1','2','3','4','5','6','7','8','9'}
    multi[1] = {'a','b','c','d','e','f','g','h','i','j'}
    multi[2] = {'A','B','C','D','E','F','G','H','I','J'}
    multi[3] = {'9','8','7','6','5','4','3','2','1','0'}
    multi[4] = {'J','I','H','G','F','E','D','C','B','A'}

\end{minted}

At the same time, individual elements might be addressable using syntax
such as:

\begin{minted}{c}
    multi[0][3] = '3'
    multi[1][7] = 'h'
    multi[4][0] = 'J'
\end{minted}

Since arrays are contiguous in memory, our actual memory block for the
above should look like:

\begin{minted}{c}
    0123456789abcdefghijABCDEFGHIJ9876543210JIHGFEDCBA
    ^
    |_____ starting at the address &multi[0][0]
\end{minted}

Note that I did \textbf{not} write \textbf{multi{[}0{]} = "0123456789"}.
Had I done so a terminating \textbf{'\textbackslash0'} would have been
implied since whenever double quotes are used a
\textbf{'\textbackslash0}' character is appended to the characters
contained within those quotes. Had that been the case I would have had
to set aside room for 11 characters per row instead of 10.

My goal in the above is to illustrate how memory is laid out for 2
dimensional arrays. That is, this is a 2 dimensional array of
characters, NOT an array of "strings".

Now, the compiler knows how many columns are present in the array so it
can interpret \textbf{multi + 1} as the address of the 'a' in the 2nd
row above. That is, it adds 10, the number of columns, to get this
location. If we were dealing with integers and an array with the same
dimension the compiler would add \textbf{10*sizeof(int)} which, on my
machine, would be 20. Thus, the address of the \textbf{9} in the 4th row
above would be \textbf{\&multi{[}3{]}{[}0{]}} or \textbf{*(multi + 3)}
in pointer notation. To get to the content of the 2nd element in the 4th
row we add 1 to this address and dereference the result as in

\begin{minted}{c}
    *(*(multi + 3) + 1)
\end{minted}

With a little thought we can see that:

\begin{minted}{c}
    *(*(multi + row) + col)    and
    multi[row][col]            yield the same results.
\end{minted}

The following program illustrates this using integer arrays instead of
character arrays.

-----------  Program 2.1  -----------------------------------
\inputminted{c}{../src/ch2-1.c}
--------- end program 2.1 -----------------------------------

Because of the double de-referencing required in the pointer version,
the name of a 2 dimensional array is often said to be equivalent to a
pointer to a pointer. With a three dimensional array we would be dealing
with an array of arrays of arrays and some might say its name would be
equivalent to a pointer to a pointer to a pointer. However, here we have
initially set aside the block of memory for the array by defining it
using array notation. Hence, we are dealing with a constant, not a
variable. That is we are talking about a fixed address not a variable
pointer. The dereferencing function used above permits us to access any
element in the array of arrays without the need of changing the value of
that address (the address of \textbf{multi{[}0{]}{[}0{]}} as given by
the symbol \textbf{multi}).

\begin{comment}
\href{ch7x.htm}{Chapter 7: More on Multi-Dimensional Arrays}

\href{pointers.htm}{Back to Table of Contents}
\end{comment}
